Syntactic N-grams as Features for the Author Profiling Task: Notebook for PAN at CLEF 2015

نویسندگان

  • Juan Pablo Posadas-Durán
  • Helena Gómez-Adorno
  • Ilia Markov
  • Grigori Sidorov
  • Ildar Z. Batyrshin
  • Alexander F. Gelbukh
  • Obdulia Pichardo-Lagunas
چکیده

This paper describes our approach to tackle the Author Profiling task at PAN 2015. Our method relies on syntactic features, such as syntactic based n-grams of various types in order to predict the age, gender and personality traits that has the author of a given text. In this paper, we describe the used features, the employed classification algorithm, and other general ideas concerning the experiments we conducted.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Author Verification Using Syntactic N-grams: Notebook for PAN at CLEF 2015

This paper describes our approach to tackle the Author Verification task at PAN 2015. Our method builds a representation of an author’s style by using the information contained in dependency trees. This information is represented as syntactic n-grams and used to conform a vector space. Using unsupervised machine learning approach, each instance is associated to the correponding author using the...

متن کامل

Topic Models and n-gram Language Models for Author Profiling - Notebook for PAN at CLEF 2015

Author profiling is the task of determining the attributes for a set of authors. This paper presents the design, approach, and results of our submission to the PAN 2015 Author Profiling Shared Task. Four corpora, each in a different language, were provided. Each corpus consisted of collections of tweets for a number of Twitter users whose gender, age and personality scores are know. The task wa...

متن کامل

ITALICA at PAN 2013: An Ensemble Learning Approach to Author Profiling Notebook for PAN at CLEF 2013

This notebook discusses the approach to the Author Profiling task developed by the Italica group for PAN 2013. This system implements two different sets of classifiers which are combined later in order to build a final classifier that takes into account the decisions of the previous ones. The initial classifiers are focused on vector space representations of the documents as a bag of words and ...

متن کامل

Readability for Author Profiling? Notebook for PAN at CLEF 2013

This paper briefly describes the approach taken to the Author Profiling task at PAN 13. It describes the simple features used, and the origins in thinking around text readability as a mechanism for identification, and the predictive model used which may have beneficially omitted classes, as well as offering commentary on the results obtained.

متن کامل

UniNE at CLEF 2015 Author Profiling: Notebook for PAN at CLEF 2015

This paper describes and evaluates an effective author profiling model called SPATIUM-L1. The suggested strategy can be adapted without any problem to different languages (such as Dutch, English, Italian, and Spanish) in Twitter tweets. As features, we suggest using the 200 most frequent terms of the query text (isolated words and punctuation symbols). Applying a simple distance measure and loo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015